Abstract

Objective: To develop novel, scalable, and valid literacy profiles for identifying
patients with limited health literacy by harnessing natural language processing.

Data Source: With respect to the linguistic content, we analyzed 283 216 secure 
messages sent by 6941 diabetes patients to physicians within an integrated system's 
electronic portal. Sociodemographic, clinical, and utilization data were obtained via 
questionnaire and electronic health records. 

Study Design: A retrospective study used natural language processing and machine
learning to generate five unique “Literacy Profiles” by employing various sets of
linguistic indices: Flesch-Kincaid (LP_FK); basic indices of writing complexity, including
lexical diversity (LP_LD) and writing quality (LP_WQ); and advanced indices related
to syntactic complexity, lexical sophistication, and diversity, modeled from
self-reported (LP_SR) and expert-rated (LP_Exp) health literacy. We first determined the
performance of each literacy profile relative to self-reported and expert-rated health
literacy in discriminating between high and low health literacy, and then assessed the
Literacy Profiles’ relationships with known correlates of health literacy, such as
patient sociodemographics and a range of health-related outcomes, including ratings
of physician communication, medication adherence, diabetes control, comorbidities,
and utilization.

Principal Findings: LP_SR and LP_Exp performed best in discriminating between high
and low self-reported (C-statistics: 0.86 and 0.58, respectively) and expert-rated
health literacy (C-statistics: 0.71 and 0.87, respectively) and were significantly
associated with educational attainment, race/ethnicity, Consumer Assessment of
Healthcare Providers and Systems (CAHPS) scores, adherence, glycemia, comorbidities,
and emergency department visits.

Conclusions: Since health literacy is a potentially remediable explanatory factor
in health care disparities, the development of automated health literacy indicators
represents a significant accomplishment with broad clinical and population health
applications. Health systems could apply literacy profiles to efficiently determine
whether quality of care and outcomes vary by patient health literacy; identify at-risk
populations for targeting tailored health communications and self-management
support interventions; and inform clinicians to promote improvements in individual-level
care.

1 | INTRODUCTION


Patient-physician communication is a fundamental pillar of care that
influences patient satisfaction and health outcomes, particularly in
diabetes mellitus.2 More than 30 million US adults are living with
diabetes, and one quarter to one third of them have limited health
literacy skills. Limited health literacy is associated with untoward
and costly diabetes outcomes that contribute to health disparities.
Limited health literacy impedes physician-patient communication
and poses a barrier to patients’ learning and understanding
across numerous communication domains.

Being able to assess patients’ health literacy is of interest to
clinicians, delivery systems, and the public health community.
Clinicians often are unaware of the health literacy status of their
patients and have been found to be both receptive and responsive
to receiving this information. Ignoring differences
in health literacy in population management has been shown to
amplify health literacy-related disparities. To date, identifying
patients with limited health literacy has proven painstaking and
infeasible to scale. Because “big data” (in this case, data derived from
patients’ written secure messages sent via patient portals) are
increasingly available, we sought to determine whether natural
language processing tools and machine learning approaches can
be used to identify patients with limited health literacy. An
automated process, if it could generate health literacy estimates with
sufficient accuracy, would provide an efficient means to identify
patients with limited health literacy, with a number of implications
for improving health services delivery. The few formulas used in
prior health literacy studies of written text (eg, Flesch-Kincaid,
SMOG) depend on surface-level lexical and sentential features;
those studies did not examine secure messages and did not use natural
language processing or machine learning. We are aware of only
two studies that attempted to identify patient health literacy using
secure message content. Both studies developed predictive
models of health literacy using natural language processing based
on linguistic features extracted from secure messages. These
“Literacy Profiles” were generated from patients’ self-reported
health literacy and from expert ratings of health literacy based on
secure message quality, and they showed promising results.


WHAT IS KNOWN ON THIS TOPIC


• Limited health literacy is associated with untoward and
costly health outcomes that contribute to health disparities,
and poor communication exchange is an important
mediator in the relationship between limited health
literacy and health outcomes.

• Given the time and personnel demands intrinsic to current
health literacy instruments, combined with the sensitive
nature of screening, measuring health literacy is
both challenging and controversial.

• Electronic patient portals are an increasingly popular
channel for patients and providers to communicate via
secure messaging, and secure messages contain linguistic
content that could be analyzed to measure patient
health literacy.


WHAT THIS STUDY ADDS


• Two valid literacy profiles were generated from patients’ secure
messages by applying computational linguistics
approaches to “big linguistic data”, creating a novel,
feasible, automated, and scalable strategy to identify
patients and subpopulations with limited health literacy.

• Literacy profiles can provide a health IT tool to enable
tailored communication support and other targeted
interventions with potential to reduce health
literacy-related disparities.


To advance methods for identifying patients’ health literacy automatically,
we built on this prior work and developed and compared
five literacy profiles based on distinct theoretical models and associated
natural language processing (NLP) tools and machine learning
techniques. The primary goal of the current study was to compare the
relative performance of these literacy profiles with respect to their
ability to discriminate between limited vs. adequate health literacy in
a large sample of diabetes patients based on a large written corpus
of patients’ secure messages. The secondary goal was to assess the
extent to which these different literacy profiles are associated with
patterns that mirror previous research in terms of their relationships
with patient sociodemographics, ratings of physician communication,
and a range of diabetes-related health outcomes.


[Figure 1: flowchart not reproduced. Steps and counts recoverable from the extracted text:
  Survey responders: MRN # = 20,188
  Patients with at least one message, 2006 <= year(msg_date) <= 2015: MRN # = 12,380; SM # = 1,148,831
  Excluded messages: system generated, n = 27,268; questionnaire, n = 1,090; no message contents found, n = 69,896
  Remaining: MRN # = 12,286; SM # = 1,050,577; PCP_ID # = 15,727; thread id # = 547,226
  SM from patient (TOFROM_PAT_C = 2): MRN # = 10,742; SM # = 454,484; PCP_ID # = 10,646; thread id # = 338,108
  Eligible for proxy algorithm (at least 50 words in aggregated secure messages): MRN # = 10,504; SM # = 441,615
  Not eligible for proxy algorithm: MRN # = 238
  No proxy SM found: MRN # = 5,706; SM # = 84,009
  Identified as proxy user: MRN # = 4,798; SM # = 357,606
  Patients with >= 50% proxy-written SM removed: n = 588; patients with proxy SM < 50%: MRN # = 4,210
  Sample for LP score development: MRN # = 10,154
  After removing patients and messages without all values for the self-reported health literacy measures: LP, MRN # = 6941; SM # = 283,216]

FIGURE 1. Patient and secure messages inclusion/exclusion flowchart*. *MRN#: Patient ID; msg_date: Date of message sent; Svy: survey;
SM#: number of secure messages; LP: literacy profile; PCP_ID: primary care provider ID; proxy_pct: % of proxy messages; TOFROM_PAT_C:
SM sent by the patient


2 | METHODS
2.1 | Data sources and participants 


Our sampling frame included over one million secure messages 
generated by >150 000 ethnically diverse diabetes patients in the 
Kaiser Permanente Northern California (KPNC) Diabetes Registry, 
and >9000 primary care physicians. KPNC is a fully integrated health 
system that provides care to ~4.4 million patients and supports a 
well-developed and mature patient portal (kp.org). 

The current study includes the subset of the KPNC registry
patients who completed a 2005-2006 survey as part of the Diabetes
Study of Northern California (DISTANCE) and responded to the
self-reported health literacy items on the associated questionnaire
(N = 14 357). DISTANCE surveyed diabetes patients,
oversampling minority subgroups to assess the role of sociodemographic
factors on quality and outcomes of care. The average age
of the study population at the time was 56.8 (±10); 54.3 percent
were male; and 18.4 percent Latino, 16.9 percent African American,
22.8 percent Caucasian, 11.9 percent Filipino, 11.4 percent Asian
(Chinese or Japanese), 7.5 percent South Asian/Pacific Islander/
Native American/Eskimo, and 11.0 percent multi-racial. Variables
were collected from questionnaires completed via telephone, online,
or paper and pencil (62 percent response rate). Details of the
DISTANCE Study have been reported previously.

We first extracted all secure messages (N = 1 050 577) exchanged
from 01/01/2006 through 12/31/2015 between diabetes
patients and all clinicians from KPNC’s patient portal. For the
current analyses, only those secure messages that a patient sent
to his or her primary care physician were included. We excluded
all secure messages from patients who did not have matching
DISTANCE survey data, written in a language other than English,
or written by proxy caregivers (determined by the KP.org proxy
check-box or by a validated NLP algorithm). The study flowchart
(Figure 1) shows details about inclusion/exclusion of patients
and associated secure messages. The final dataset consisted of
283 216 secure messages sent by 6941 patients to their primary
care physicians.

The number of individual secure messages (SMs) sent by a patient to their
physician(s) ranged between 2 and 205, and the mean number of SMs
sent was 39.88. For each patient, all secure messages were then
collated into a single file. The length of patients’ aggregated SMs
ranged from 1 to 16 469 words, with a mean length of
2058.95 words. To provide appropriate linguistic coverage to develop
literacy profiles, we excluded patients whose aggregated
secure messages lacked sufficient words (<50 words, see Figure 1),
a threshold based on previous NLP text research in learning
analytics domains.
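
The collation and minimum-length step described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function names, the toy messages, and the use of whitespace splitting to count words are all assumptions.

```python
from collections import defaultdict

MIN_WORDS = 50  # threshold stated in the text (<50 words excluded)


def aggregate_messages(messages):
    """Collate each patient's messages into one text, preserving input order.

    `messages` is an iterable of (patient_id, text) pairs (hypothetical layout).
    """
    by_patient = defaultdict(list)
    for patient_id, text in messages:
        by_patient[patient_id].append(text)
    return {pid: " ".join(texts) for pid, texts in by_patient.items()}


def eligible_patients(aggregated, min_words=MIN_WORDS):
    """Keep patients whose aggregated text has at least `min_words` tokens.

    Words are counted by whitespace splitting, which is an assumption.
    """
    return {pid: text for pid, text in aggregated.items()
            if len(text.split()) >= min_words}


# Hypothetical toy data: p1 writes 65 words in total, p2 writes 2.
messages = [
    ("p1", "My sugar was high this morning. " * 10),
    ("p1", "Should I change my dose?"),
    ("p2", "Thanks doctor."),
]
aggregated = aggregate_messages(messages)
kept = eligible_patients(aggregated)
print(sorted(kept))
```

Aggregating before filtering matters here: the threshold applies to a patient's combined text, not to individual messages, so a patient who sends many short messages can still qualify.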

This study was approved by the KPNC and UCSF Institutional
Review Boards (IRBs). All analyses involved secondary data and all
data were housed on a password-protected secure KPNC server
that could only be accessed by authorized researchers.


2.2 | Health literacy “Gold Standards” 


The DISTANCE survey included three validated health literacy items
that measure self-efficacy in specific health literacy competencies
using a 5-point Likert scale in which a response of 1 referred to
“Always” and a response of 5 to “Never”. Questions include
self-reported confidence in filling out medical forms, problems
understanding written medical information, and frequency of needing
help in reading and understanding health materials. We combined
these items to create a self-reported health literacy variable to
compare performance of the linguistic models, by averaging
scores across the health literacy items. Average scores were
dichotomized to create binary data, with scores <4.5 indicating
limited health literacy and scores ≥4.5 indicating adequate health literacy.
The threshold was determined based on the distribution of these
average scores to maintain the appropriate balance between the
health literacy categories and is consistent with prior studies that
have employed these measures.
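
The averaging and dichotomization rule above can be sketched as below. It is only an illustration: the function name is hypothetical, and the code assumes all three items have already been coded so that a higher score means better health literacy (the text does not say whether any item was reverse-coded).

```python
from statistics import mean

CUTOFF = 4.5  # threshold stated in the text


def self_reported_hl(item_scores, cutoff=CUTOFF):
    """Average the Likert items and dichotomize at the cutoff.

    Assumes each item is scored 1 to 5 with higher = better health literacy.
    Returns (average, label), where label is "adequate" or "limited".
    """
    avg = mean(item_scores)
    label = "adequate" if avg >= cutoff else "limited"
    return avg, label


print(self_reported_hl([5, 5, 4]))  # average of about 4.67
print(self_reported_hl([5, 4, 4]))  # average of about 4.33
```

With three integer items the average can never equal 4.5 exactly, so under these assumptions the rule amounts to requiring a total of at least 14 out of 15 for the adequate category.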

We also generated health literacy scores based on expert ratings
of the quality of patients’ secure messages. These ratings
used a subset of the DISTANCE sample, comprising aggregated
secure messages written by 512 patients purposively sampled
to represent a balance of self-reported health literacy, as well
as a range of age, race/ethnicity, and socio-economic status. A
health literacy scoring rubric was used to holistically assess the
perceived health literacy of the patients based on their secure
messages, adapting an established rubric used to score the writing
abilities of high school students entering college. An ordinal
scale ranging from 1 to 6 assessed the extent to which patients’
secure messages demonstrated mastery of written English,
organization and focus, and a varied, accurate, and appropriate health
vocabulary to enable clear access to the health-related content
and ideas the patient intended to express to their physician.
Because of limited relevance to the construct of health literacy,
we removed aspects of the rubric related to length, developing
points of view, and discourse-related elements important in
argumentative writing, including the use of examples, reason, and
evidence. Two raters experienced in linguistics and health literacy
research were trained twice on 25 separate aggregated
secure messages not included in the 512 messages used in the final
analysis. After reaching a satisfactory inter-rater reliability (IRR,
r > .70), raters independently scored the 512 messages. Secure
messages were categorized into two groups: limited health literacy
(scores < 4, n = 200) and adequate health literacy (scores ≥ 4,
n = 312).

We examined whether the self-reported and expert-rated health
literacy measures were associated with each other before
these two measures were employed to train literacy profiles.
Cramér's V was used to measure the strength of the association
(effect size), and the chi-squared test was used to assess its significance. The
two variables were significantly different (P = .01) and only weakly
correlated (r = 0.118, P = .001).


TABLE 1 Linguistic indices used in five literacy profiles (a)

| Literacy profile | Linguistic index | Description |
|---|---|---|
| LP_FK | Readability | The length of words (ie, number of letters or syllables) and length of sentences (ie, number of words) |
| LP_LD | Lexical diversity | The variety of words used in a text based on D |
| LP_WQ | Word frequency | Frequency of word in a reference corpus |
| LP_WQ | Syntactic complexity | Number of words before the main verb in a sentence |
| LP_WQ | Lexical diversity | The variety of words used in a text based on MTLD |
| LP_SR | Concreteness | The degree to which a word is concrete |
| LP_SR | Lexical diversity | The variety of words used in a text based on two measures of lexical diversity: MTLD and D |
| LP_SR | Present tense | Incidence of present tense |
| LP_SR | Determiners | Incidence of determiners (eg, a, the) |
| LP_SR | Adjectives | Incidence of adjectives |
| LP_SR | Function words | Incidence of function words such as prepositions, pronouns, etc |
| LP_Exp | Age of exposure | The estimated age at which a word first appears in a child's vocabulary |
| LP_Exp | Lexical decision response time | The time it takes for a human to judge a string of characters as a word |
| LP_Exp | Attested lemmas | Number of attested lemmas used per verb argument construction |
| LP_Exp | Determiner per nominal phrase | Number of determiners in each noun phrase |
| LP_Exp | Dependents per nominal subject | Number of structural dependents for each subject in a noun phrase |
| LP_Exp | Number of associations | Number of words strongly associated with a single word |

Abbreviations: LP_Exp, Literacy Profile Expert-Rated Health Literacy; LP_FK, Literacy Profile
Flesch-Kincaid; LP_LD, Literacy Profile Lexical Diversity; LP_SR, Literacy Profile Self-Reported
Health Literacy; LP_WQ, Literacy Profile Writing Quality.

(a) We present examples of linguistic indices for LP_SR (n = 185) and LP_Exp (n = 8).
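
The association check between the two dichotomized health literacy measures in Section 2.2 (chi-squared test plus Cramér's V) can be sketched as below. This is an illustration only: the 2 x 2 counts are hypothetical, no continuity correction is applied, and the closed-form p-value shown holds only for one degree of freedom.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic for an r x c table of counts.

    No continuity correction is applied. Returns (statistic, total count).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    stat, n = chi_squared(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


def p_value_df1(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    Valid for 2 x 2 tables only; uses P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts: rows = self-reported (limited, adequate),
# columns = expert-rated (limited, adequate).
toy = [[30, 20], [20, 30]]
stat, n = chi_squared(toy)
print(round(stat, 3), round(cramers_v(toy), 3), round(p_value_df1(stat), 4))
```

Reporting both quantities is useful because they answer different questions: with a large sample a chi-squared test can be significant even when V shows that the two measures overlap only weakly.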


2.3 | Natural language processing (NLP) tools 


The linguistic features we examined were derived from the
patients’ secure messages using several NLP tools that measure different
aspects of language, such as text-level information (eg, number of
words in the text, type-token ratio), lexical sophistication (eg, word
frequency, concreteness), syntactic complexity (eg, embedded clauses
and phrasal complexity), and text cohesion (eg, connectives, word
overlap). These tools were selected because they measure linguistic
features that are important aspects of literacy, including text
complexity, readability, and cohesion. The tools included the Tool for the
Automatic Analysis of Lexical Sophistication, the Tool for
the Automatic Analysis of Cohesion, the Tool for the Automatic
Analysis of Syntactic Sophistication and Complexity, the
SEntiment ANalysis and Cognition Engine, and Coh-Metrix.
These open-access tools rely on several NLP packages and resources to
process text, including the Stanford Parser, the British National
Corpus, the MRC psycholinguistic database, Collins Birmingham
University International Language Database frequency norms,33 and
WordNet. These tools have been developed using Python and Java.
To generate word frequencies for medical terminology, we used
medical corpora from HIstory of MEdicine coRpus Annotation and
Informatics for Integrating Biology and the Bedside.


2.4 | Literacy profiles developed 


Using the patients’ secure messages, we applied natural language
processing and machine learning techniques to develop five separate
literacy profile prototypes for categorizing patients’ self-reported
and expert-rated health literacy. The literacy
profiles differed in the dependent health literacy variable
(self-reported or expert-rated) and in the linguistic features that were
used as independent variables to develop them.
Each literacy profile is briefly discussed below, and the component
linguistic indices used for each are summarized in Table 1.


2.4.1 | Literacy Profile Flesch-Kincaid (LP_FK)


As a “baseline” literacy profile, we calculated Flesch-Kincaid
readability scores for the secure messages. We used Flesch-Kincaid as
a baseline measure because it is one of the most commonly used and
widely available readability formulas in the medical domain, where it
has been used to assess the readability and comprehensibility of a broad
range of medical information. Flesch-Kincaid is based on the average number
of words per sentence and the average number of syllables per word.
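
As a concrete illustration of the two ingredients named above, the sketch below computes the standard Flesch-Kincaid grade level, 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59. The text does not say which Flesch-Kincaid variant or implementation was used, so treating it as the grade-level formula is an assumption, and the vowel-group syllable counter and regex tokenizer are rough stand-ins for what a dedicated tool would do.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per run of consecutive vowels (assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level from words/sentence and syllables/word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


simple = "The cat sat. The dog ran."
complex_text = "Intermittent hyperglycemia necessitates medication adjustment."
print(round(flesch_kincaid_grade(simple), 2))
print(flesch_kincaid_grade(complex_text) > flesch_kincaid_grade(simple))
```

Because the score depends only on word and sentence length, two messages with very different vocabulary or grammar can receive the same grade, which is why the formula serves here only as a baseline.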


2.4.2 | Literacy Profile Lexical Diversity (LP_LD) 


We used lexical diversity as an additional baseline measure because
it is a commonly used and straightforward method for assessing
writing proficiency in the linguistics domain, and it captures both
lexical richness and text cohesion. Both of these features are
consistent predictors of text sophistication and writing quality. We
calculated lexical diversity based on a type-token ratio (TTR)
measure. TTR measures assess lexical variety as the number of
unique words produced (types) divided by the total number of
words produced (tokens), to evaluate writers’ lexical production. The TTR
measure D was used because it controls for text length by calculating
probability curves that mathematically model how new words are
introduced into increasingly large language samples.
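
The sketch below shows the plain type-token ratio and why a length-controlled variant is needed. It does not implement D itself; the tokenizer and the toy text are assumptions made for illustration.

```python
import re


def type_token_ratio(text):
    """Plain TTR: unique word forms (types) divided by total words (tokens)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no word tokens")
    return len(set(tokens)) / len(tokens)


short_text = "the cat saw the other cat"
long_text = " ".join([short_text] * 2)  # same vocabulary, twice the length

print(round(type_token_ratio(short_text), 3))  # 4 types / 6 tokens
print(round(type_token_ratio(long_text), 3))   # 4 types / 12 tokens
```

The ratio halves when the same vocabulary is spread over twice as many tokens, which shows how strongly plain TTR depends on text length. Aggregated secure messages in this corpus range from tens to thousands of words, so an uncorrected ratio would penalize prolific writers.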


2.4.3 | Literacy Profile Writing Quality (LP_WQ) 


We used a previously validated model to classify secure messages as
either low or high in terms of writing quality. The model, derived from
three linguistic indices of word frequency, syntactic complexity, and
lexical diversity,29,48 reveals that higher-level writers use more infrequent
and lexically diverse words and more syntactically complex structures.


2.4.4 | Literacy Profile Self-Reported Health 
Literacy (LP_SR) 


A set of 185 linguistic features was calculated from the patients’
secure messages and used to predict patients’ self-reported health
literacy scores. A subset of the linguistic indices used for developing
this literacy profile is provided in Table 1. The rationale,
development, and experimental design for LP_SR have been briefly
discussed in the Health Literacy “Gold Standards” section, and the
details have also been previously reported.


2.4.5 | Literacy Profile Expert-Rated Health Literacy 
(LP_Exp) 


A set of eight linguistic indices, including lexical decision latencies,
age of exposure, word naming response times, academic word lists,
bigram association strength, and dependency structures, was
used as independent variables to predict human ratings of health
literacy from the purposively sampled subset of 512 secure messages
used in the LP_SR analysis (Table 1). Additional details related to the
development and experimental design of LP_Exp have been previously
reported and can also be found in the Health Literacy “Gold
Standards” section.


2.5 | Assessing performance of literacy profiles 
against gold standards 


We compared the performance of the five literacy profiles using
several supervised machine learning classification algorithms:
linear discriminant analysis (LDA), random forests, support vector
machines (SVM), naive Bayes, and neural networks. In supervised
machine learning, the algorithm learns from a labeled dataset,
which provides an answer key that the algorithm uses to learn to classify
unseen data and against which its accuracy can be evaluated. There are
two main areas where supervised machine learning is useful: classification and
regression. Classification problems ask the algorithm to predict a
discrete value, identifying the input data as a member of a class
or group. The models in this study were trained and tested
using Weka (version 3.8.1) and R (version 3.3.2) implementations.
We first examined performance between the self-reported health
literacy and other literacy profiles, and then between the expert
ratings of health literacy and all other literacy profiles. We report
results for the models using support vector machines, because they
yielded the best results for all the literacy profiles. Using a
randomly allocated split-sample approach, we report discriminatory
performance results using c-statistics (area under the receiver
operating characteristic [ROC] curve), sensitivity, specificity, and positive and
negative predictive values (PPV and NPV).
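
The performance measures listed above can be sketched as follows. The study itself used Weka and R, so this pure-Python version is only an illustration of the definitions. The confusion-matrix counts are not reported in the source: they are back-calculated here as one set of counts that reproduces the LP_Exp row of Figure 3, assuming "adequate health literacy" is the positive class and that all 512 expert-rated patients (312 adequate, 200 limited) were scored. The scores passed to the c-statistic function are hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def c_statistic(scores_pos, scores_neg):
    """Area under the ROC curve as a pairwise concordance probability.

    Fraction of (positive, negative) pairs in which the positive case has
    the higher score; ties count as half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Back-calculated (assumed) counts for LP_Exp vs expert ratings:
# 312 adequate = 287 TP + 25 FN; 200 limited = 143 TN + 57 FP.
metrics = confusion_metrics(tp=287, fn=25, tn=143, fp=57)
print({name: round(value, 4) for name, value in metrics.items()})

# Hypothetical classifier scores, for illustration only.
print(round(c_statistic([0.9, 0.8, 0.4], [0.7, 0.3]), 4))
```

Sensitivity and specificity depend only on how each true class is classified, whereas PPV and NPV also depend on the class balance of the sample, so the predictive values reported for a purposively balanced subset would not carry over unchanged to a population with a different prevalence of limited health literacy.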


2.6 | Assessing criterion-related validity for 
literacy profiles 


We examined associations between the health literacy classifications
generated by the literacy profiles and known correlates of
health literacy, including patients’ educational attainment, race/
ethnicity, and age. Because of the known association between
limited health literacy and suboptimal patient-provider communication,
we also examined relationships with patients’ reports
of physician communication using an adapted version of the most
health literacy-relevant item from the 4-item CAHPS survey: “In
the last one year, how often have your physician and health care
providers explained things in a way that you could understand?”
We defined communication as “poor” if the patient reported that
his or her doctor and health care team “never” or “sometimes”
explained things in a way that he/she could understand. We
also examined the extent to which each literacy profile was
associated with diabetes-related outcomes previously found to
be associated with health literacy. These included adherence to
cardio-metabolic medications based on continuous medication
gaps (CMG), a validated measure based on percent time with
insufficient medication supply; hemoglobin A1c (HbA1c), an
integrated measure of blood sugar control, measured both as optimal
(HbA1c < 7 percent) and poor control (HbA1c ≥ 9 percent); ≥1
clinically relevant hypoglycemic episodes (an adverse drug event
associated with diabetes treatment); and comorbidities, using
the Charlson index (Deyo version). HbA1c reflected the
value collected after the first secure message was sent. CMG,
hypoglycemia, and Charlson index were measured in the year
before the first secure message. The occurrence of one or more
hypoglycemia-related ED visits or hospitalizations in the year prior
was based on a validated algorithm that uses specific diagnostic
codes. Finally, we explored relationships between each literacy
profile and outpatient, emergency room, and hospitalization
utilization data in the 12 months prior to the first secure message date. For
all analyses, we examined bivariate associations between each of
the literacy profiles and sociodemographics, the single CAHPS
item, and health outcomes using a two-sided p-value at the 0.05
level. Categorical variables such as education, race, adherence,
HbA1c levels, and hypoglycemia were analyzed using chi-square
analysis. For comorbidity and health care utilization rates, mean
comparisons were conducted using t tests.
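
Two of the outcome definitions above are simple threshold rules and can be sketched as below. The function names are hypothetical, and the response options other than “never” and “sometimes” are assumed to follow the usual CAHPS four-point frequency scale.

```python
POOR_RESPONSES = {"never", "sometimes"}  # responses defined as "poor" in the text


def poor_communication(response):
    """True if the CAHPS item response counts as poor communication."""
    return response.strip().lower() in POOR_RESPONSES


def hba1c_flags(hba1c_percent):
    """Binary glycemic-control indicators used as separate outcomes.

    optimal: HbA1c < 7 percent; poor: HbA1c >= 9 percent.
    Values from 7 up to (but not including) 9 fall in neither category.
    """
    return {"optimal": hba1c_percent < 7.0, "poor": hba1c_percent >= 9.0}


print(poor_communication("Sometimes"))  # counts as poor
print(poor_communication("Usually"))    # assumed response option, not poor
print(hba1c_flags(6.5))
print(hba1c_flags(8.0))
print(hba1c_flags(9.0))
```

Because optimal and poor control are coded as two separate binary outcomes rather than one three-level variable, a literacy profile can be associated with one of them and not the other.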




3 | RESULTS 
3.1 | Criterion-related validity of literacy profiles 
based on performance against gold standards 


3.1.1 | Performance with respect to self-reported 
health literacy 


When self-reported health literacy was the dependent variable,
LP_SR performed well in terms of its ability to discriminate between
those with limited vs. adequate health literacy, with a c-statistic of
0.86; sensitivity was high, specificity was modest (0.67), and PPV
and NPV were acceptable. All other literacy profiles performed
poorly (Figure 2). LP_Exp, while performing slightly better than
LP_FK, LP_LD, and LP_WQ, yielded a c-statistic of 0.58 and sensitivity
in the intermediate range; specificity, PPV, and NPV values were all
low.


3.1.2 | Performance with respect to expert-rated 
health literacy 


When expert-rated health literacy was the dependent variable,
LP_Exp performed best in terms of its ability to discriminate
between those with limited vs. adequate health literacy, with a
c-statistic of 0.87, high sensitivity, moderate specificity, and PPV
and NPV in an acceptable range. LP_FK and LP_LD each
performed poorly (Figure 3). LP_WQ performed better, with all the
performance metrics > 0.70 except for specificity. Performance
metrics for LP_SR were sub-optimal, with a c-statistic of 0.71,
intermediate sensitivity, moderate specificity and PPV, but low
NPV.


[Figure 2: ROC plot not reproduced; the only curve label legible in the extracted text reads AUC: 85.6%.]

FIGURE 2 ROCs and performance metrics for the literacy profiles relative to self-reported health literacy. AUC: Area Under Curve;
LP_Exp: Literacy Profile Expert-Rated Health Literacy; LP_FK: Literacy Profile Flesch-Kincaid; LP_LD: Literacy Profile Lexical Diversity;
LP_SR: Literacy Profile Self-Reported Health Literacy; LP_WQ: Literacy Profile Writing Quality; ML: Machine Learning; SVM: Support Vector
Machine [Color figure can be viewed at wileyonlinelibrary.com]


3.2 | Predictive validity based on associations with 
sociodemographics, communication ratings, and 
health-related outcomes 


3.2.1 | Sociodemographics 


We found patterns that mirrored previously observed health
literacy-related relationships, with considerable variation across patient
characteristics and literacy profile type. Table 2 shows the
educational attainment (% with college degree vs. less), race (% white vs.
non-white), and the mean age among patients predicted to have
adequate vs. inadequate health literacy for each of the five literacy
profiles. All literacy profiles generated classifications in which limited
health literacy was associated with non-white race. LP_LD, LP_SR,
and LP_Exp each were associated with lower education, with the
strongest effects observed for LP_SR and LP_Exp. Only LP_SR was
associated with older patient age.


3.2.2 | Provider communication 


The proportion of patients identified as having limited or adequate
health literacy and who reported poor physician communication is
shown in Table 3. Only those patients predicted to have limited health
literacy by LP_SR and LP_Exp were significantly more likely to
rate their health care providers as “poor” on the CAHPS item, with
somewhat more robust findings for LP_SR.


3.2.3 | Health outcomes 


Limited health literacy as categorized by only three of the literacy
profiles (LP_FK, LP_SR, and LP_Exp) was associated with poor
cardio-metabolic medication adherence, serious hypoglycemia, and greater
comorbidity. Poor medication adherence was most robustly
associated with LP_FK and LP_Exp. Limited health literacy as measured
only by LP_FK and LP_Exp was associated with both diabetes control
measures: a lower prevalence of optimal control (HbA1c < 7%) and a
higher prevalence of poor control (HbA1c ≥ 9%).


[Figure 3: ROC plot not reproduced; x-axis: 100 - Specificity (%). Curve labels legible in the extracted text: AUC 87.1%, 74.1%, 71.3%, 57.8%, 55.7%.]

| LP | Number of indices | ML approach | C-statistic | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|---|
| LP_FK | Readability (1) | LDA | 0.557 | 0.9968 | 0.0050 | 0.6098 | 0.5000 |
| LP_LD | Lexical diversity (1) | [illegible] | 0.578 | 0.9839 | 0.1000 | 0.6296 | 0.6296 |
| LP_WQ | Linguistic variables (3) | SVM | 0.741 | 0.8718 | 0.4950 | 0.7292 | 0.7122 |
| LP_SR | Linguistic variables (185) | SVM | 0.713 | 0.6945 | 0.6000 | 0.7297 | 0.5581 |
| LP_Exp | Linguistic variables (8) | SVM | 0.871 | 0.9199 | 0.7150 | 0.8343 | 0.8512 |

FIGURE 3 ROCs and performance metrics for the literacy profiles relative to expert-rated literacy. AUC: Area Under Curve; LDA: Linear
Discriminant Analysis; LP_Exp: Literacy Profile Expert-Rated Health Literacy; LP_FK: Literacy Profile Flesch-Kincaid; LP_LD: Literacy Profile
Lexical Diversity; LP_SR: Literacy Profile Self-Reported Health Literacy; LP_WQ: Literacy Profile Writing Quality; ML: Machine Learning;
SVM: Support Vector Machine [Color figure can be viewed at wileyonlinelibrary.com]


TABLE 2 Prevalence of sociodemographic characteristics by literacy profile

| Literacy profile | College degree %: limited HL | College degree %: adequate HL | P-value | White %: limited HL | White %: adequate HL | P-value | Age at survey, mean (SD): limited HL | Age at survey, mean (SD): adequate HL | P-value |
|---|---|---|---|---|---|---|---|---|---|
| LP_FK | 66.3 | 60.4 | .076 | 22.0 | 33.7 | <.001 | 56.60 (11.4) | 57.29 (9.89) | .305 |
| LP_LD | 68.9 | 59.9 | [illegible] | [illegible; extracted text reads "22 14"] | 32.6 | <.001 | 56.72 (9.73) | 56.74 (10.0) | .966 |
| LP_WQ | 60.4 | 57.4 | .070 | 31.7 | 34.8 | .056 | 56.71 (9.89) | 57.44 (10.2) | .032 |
| LP_SR | [illegible; extracted text reads "7253"] | 58.7 | [illegible] | [illegible] | 36.5 | <.001 | 58.88 (9.98) | 55.74 (9.74) | <.001 |
| LP_Exp | 71.2 | 57.4 | <.001 | 23.1 | 33.1 | <.001 | 56.70 (10.2) | 57.80 (10.0) | <.001 |

Abbreviations: HL, health literacy; LP_Exp, Literacy Profile Expert-Rated Health Literacy; LP_FK, Literacy Profile Flesch-Kincaid; LP_LD, Literacy Profile Lexical Diversity;
LP_SR, Literacy Profile Self-Reported Health Literacy; LP_WQ, Literacy Profile Writing Quality; SD, Standard Deviation.


TABLE 3 Associations between five literacy profiles and single-item CAHPS ratings of poor physician communication, diabetes-related
health outcomes (%), and annual health care service utilization, mean visits (SD)

(Limited = limited health literacy; Adequate = adequate health literacy)

Health outcome                        Group     LP_FK        LP_LD        LP_WQ        LP_SR         LP_Exp
Poor physician communication (%)      Limited   12.2         10.6         9.2          13.8          15.5
                                      Adequate  8.8          9.6          10.7         7.3           11.3
                                      P-value   .0919        .5610        .1372        <.001         <.001
Poor medication adherence (%)         Limited   29.0         26.6         23.9         25.6          27.9
                                      Adequate  22.9         23.8         25.3         23.4          22.9
                                      P-value   .043         .277         .364         .047          <.001
≥1 severe hypoglycemia (%)            Limited   8.9          4.3          2.9          5.1           4.3
                                      Adequate  3.1          3.3          3.9          2.0           3.4
                                      P-value   <.001        .318         .119         <.001         .02
HbA1c < 7% (%)                        Limited   40.2         44.8         47.4         45.9          43.3
                                      Adequate  48.8         48.5         45.2         47.7          47.4
                                      P-value   .011         .202         [illegible]  .141          <.001
HbA1c ≥ 9% (%)                        Limited   19.1         13.3         13.8         14.6          16.4
                                      Adequate  13.7         13.5         14.6         13.5          13.0
                                      P-value   .02          .91          .499         .24           <.001
Charlson Index, mean (SD)             Limited   2.61 (1.84)  2.36 (1.69)  2.20 (1.61)  2.65 (1.91)   2.42 (1.79)
                                      Adequate  2.28 (1.68)  2.31 (1.69)  2.40 (1.72)  2.02 (1.41)   2.32 (1.70)
                                      P-value   .004         .636         <.001        <.001         .006
Outpatient clinic visits, mean (SD)   Limited   9.10 (7.37)  9.01 (8.98)  9.45 (9.75)  10.29 (10.7)  9.83 (10.6)
                                      Adequate  9.53 (9.33)  9.61 (10.3)  9.42 (9.53)  9.01 (9.16)   9.68 (9.57)
                                      P-value   .479         .301         .931         <.001         .499
ED visits, mean (SD)                  Limited   0.48 (1.00)  0.47 (1.15)  0.38 (0.94)  0.53 (1.20)   0.47 (1.14)
                                      Adequate  0.39 (0.94)  0.38 (0.88)  0.43 (0.96)  0.31 (0.76)   0.42 (1.01)
                                      P-value   .170         .102         .085         <.001         .016
Hospitalization, mean (SD)            Limited   0.23 (0.71)  0.21 (0.61)  0.17 (0.60)  0.25 (0.73)   0.20 (0.65)
                                      Adequate  0.18 (0.62)  0.19 (0.65)  0.19 (0.67)  0.13 (0.54)   0.20 (0.67)
                                      P-value   .243         .604         .503         <.001         .713


Abbreviations: Exp, Expert-Rated; FK, Flesch-Kincaid; LD, Lexical Diversity; LP, Literacy Profile; SD, Standard Deviation; SR, Self-Reported; WQ, 
Writing Quality. 


3.2.4 | Health care utilization

Utilization rates associated with each of the five literacy profiles
are given in Table 3. For those classified as having limited health
literacy, LP_SR was the only model that associated inadequate
health literacy with higher rates of outpatient visits and
hospitalizations. Higher annual emergency room utilization rates were
observed for limited health literacy when assessed by both LP_SR
and LP_Exp, with health literacy-related differences more robust
for LP_SR.


4 | DISCUSSION 


Generating accurate information on a population's health literacy, or 
on an individual patient's health literacy, through the use of an au- 
tomated literacy profile efficiently provides new avenues that can 
both inform health services research as well as improve health ser- 
vices delivery and population management. The main added value of 
our approach is that it supplants the requirement to assess patients’ 
health literacy one at a time; any effort required to operationalize 
our system provides tremendous economies of scale. As such, a scal- 
able, automated measure of health literacy has the potential to en- 
able health systems to (a) efficiently determine whether quality of 
care and health outcomes vary by patient health literacy; (b) identify 
populations and/or individual patients at risk of miscommunication 
so as to better target and deliver tailored health communications and 
self-management support interventions; and (c) inform clinicians so 
as to promote improvements in individual-level care. A 2012 report 
from the National Academy of Medicine called for health systems 
to measure the extent to which quality and outcomes differ across 
patient health literacy level so that systems can take steps to reduce 
such disparities and track the success of these quality improvement 
efforts. However, to date, no measure of health literacy has been 
available to enable such comparisons. In addition, prior research has 
shown that delivering health literacy-appropriate communication in- 
terventions can disproportionately benefit those with limited health 
literacy skills or narrow extant health literacy-related disparities in 
such common conditions as diabetes, heart failure, asthma, and
end-of-life care. But translation of this work into real-world
settings has been hampered, in part, by the inability to efficiently 
scale the identification of limited health literacy so as to facilitate 
targeting those most in need. Health systems are increasingly inter- 
ested in incorporating predictive analytics as a means of risk strati- 
fying and targeting care. Harnessing “big (linguistic) data” by using 
natural language processing and machine learning approaches to 
classify levels of health literacy could open up new avenues to en- 
hance population management as well as individualize care. Failure 
to do so in population management interventions has previously 
been shown to amplify health literacy-related disparities. Finally,
prior studies have demonstrated that clinicians often overestimate
the health literacy status of their patients. However, when their
patients have been screened for health literacy, primary care physi- 
cians have been shown to be receptive to this information and, once 
they have learned that a patient has limited health literacy, physicians 
have been shown to engage in a range of communication behaviors 
that can promote better comprehension and adherence. The transla- 
tional implications of the research on physician behavior have been 
limited due, in part, to the lack of efficient and scalable measures of 
health literacy, as well as physicians’ reports that in order for them
to best respond, they would need additional system-level support.


The current study compared the performance of five literacy 
profiles generated from linguistic features extracted using natural 
language processing and trained using machine learning techniques. 
While natural language processing and machine learning tools have 
previously been employed in a variety of health care research applications,
our research is one of the first to use them to classify
patients’ health literacy. We determined that, by applying innovative
computational linguistics approaches, we were able to generate two
automated literacy profiles that (a) have sufficient accuracy in classify- 
ing levels of either self-reported health literacy (LP_SR) or expert-rated 
health literacy (LP_Exp), and (b) reveal confirmatory patterns with 
sociodemographic, communication and health variables previously 
shown to be associated with health literacy. The finding that LP_SR
and LP_Exp were weakly correlated suggests that these measures reflect
different aspects of the broader construct of health literacy.

Several limitations should be noted. First, although our sample of
patients and secure messages was large and diverse, we were able to
analyze only those patients who had engaged in secure messaging with
their physicians, likely excluding patients with severe health
literacy limitations.
However, in a related analysis (previously unpublished data), we 
found that patients with limited health literacy are accelerating in 
their use of patient portals and secure messaging relative to those 
with adequate health literacy. Based on the DISTANCE cohort, between 2006
and 2015, the proportion of those with inadequate health literacy 
who used the portal to engage in two or more secure message 
threads increased nearly ten-fold (from 6% to 57%), as compared 
to a five-fold increase among those with adequate health literacy
(13% to 74%). By 2018, 99 percent of those with either limited or
adequate health literacy who had registered for the portal had used 
the portal for secure messaging. Furthermore, we found no signifi- 
cant health literacy-related differences in exclusions at the patient 
or secure message level as shown in the flow diagram in Figure 1. 
Second, the setting in which we carried out this research raises 
questions about external validity. While limited health literacy is 
more concentrated in safety net health care settings, it is still com- 
mon in this fully insured population. KPNC has a sizable Medicaid 
population, and over one third of its diabetes patients have limited
health literacy. From an internal validity standpoint, this setting
provided access to a mature patient portal and availability of exten- 
sive linguistic and health-related data, and the fully integrated care 
and closed pharmacy system of KPNC ensured complete capture of 
health care utilization and medication refills. Relatedly, we excluded 
proxy secure messages (ie, those written by another individual on 
behalf of the patient) to enhance accuracy and limited the study to 
secure messages written in English. Third, the single-item CAHPS
measure is a subjective measure of provider communication and is 
subject to recall bias similar to that of self-reported health literacy, 
potentially over- or underestimating the strength of the association 
between LP_SR and provider communication. Fourth, although our 
literacy profiles were trained on self-reported health literacy and 
expert-rated health literacy, the absence of a universally accepted,
comprehensive, “true” gold standard for health literacy, and the fact
that we used linguistic indices validated before email exchange became
so prevalent, may limit our categorization of health literacy.
Finally, additional research in other settings and patient populations 
with different conditions may provide a more definitive answer as to 
the optimal literacy profile to use in classifying health literacy. Our 
current work to develop and evaluate a measure of discordance in 
secure message exchange that takes into account both patients’ and 
physicians’ linguistic complexity may provide further insights. 

The ECLIPPSE Project set out to harness secure messages sent by 
diabetes patients to their primary care physician(s) to develop literacy 
profiles that can identify patients with limited health literacy in an 
automated way that avoids time-consuming and potentially sensitive 
questioning of the patient. Given the time and personnel demands 
intrinsic to current health literacy instruments, measuring health 
literacy has historically been extremely challenging. An automated 
literacy profile could provide an efficient means to identify subpop- 
ulations of patients with limited health literacy. Identifying patients 
likely to have limited health literacy could prove useful for alerting 
clinicians about potential difficulties in comprehending written and/ 
or verbal instructions. Additionally, patients identified as having lim- 
ited health literacy could be supported better by receiving follow-up 
communications to ensure understanding of critical communications, 
such as new medication instructions, and promote adherence and 
increased shared meaning. As such, our research to develop automated
methods for health literacy assessment represents a significant
accomplishment with potentially broad clinical and population
health benefits in the context of health services delivery.

However, there may be privacy and ethical issues that research- 
ers and health systems planners need to consider before employing 
the automated literacy profiles for a new generation of health liter- 
acy research, or for scaling health system and clinical applications to 
reduce health literacy-related disparities. To generate literacy pro- 
files, patients’ own written words are harnessed, raising potential 
concerns about confidentiality. Further, having one's own linguistic 
data analyzed to estimate individual-level health literacy in the ab- 
sence of an explicit consent process may be perceived as problematic 
given the prior literature on literacy screening and stigma. The
fact that electronic health data—both clinical and administrative—are 
commonly used by patients' health systems to identify populations 
and individuals at risk and to target associated interventions, both 
at the clinician-patient dyad and health system-population levels, 
suggests that efforts to introduce the literacy profile methodology 
could be met with acceptance, based on these precedents. Further, 
the fact that literacy profiles are generated using computational lin- 
guistic methods that require no human engagement with the actual 
content of the messages can provide additional reassurance to the 
public. Nevertheless, health systems interested in employing liter- 
acy profiles should consider adding linguistic data to the patient-re- 
lated electronic health data for which they already obtain blanket, 
advance informed consent. In addition, researchers and health
systems should develop policy guidance that permits uses of the
literacy profiles that promote population health and reduce health
literacy-related disparities while not undermining patient well-being, as
well as practical guidance for clinicians as to how they might use the
literacy profile in patient-centered and sensitive ways. Developing
this guidance would benefit from the inclusion of, and input from, 
advisory members who have limited health literacy skills. 

In summary, the two top-performing literacy profiles (LP_SR, 
LP_Exp) revealed associations consistent with previous health liter- 
acy research across a range of outcomes related to quality, safety, co- 
morbidity, and utilization. Future implementation and dissemination 
research is needed. This research should include evaluating the trans- 
portability of our approach to deriving literacy profiles from patients’ 
secure messages to diverse health care delivery settings, the devel- 
opment of provider workflow and/or novel population management 
approaches when patients with limited health literacy are identified, 
and the effects of interventions that harness this novel source of
information on health-related outcomes. We conclude that applying
innovative NLP and machine learning approaches to generate
literacy profiles from patients’ secure messages is a novel, feasible,
automated, and scalable strategy to identify patients and subpopulations
with limited health literacy, thus providing a health IT tool to
enable tailored communication support and other targeted interventions
with the potential to reduce health literacy-related disparities.